Springfield
Immigration Agents Are Killing and Abusing People. So Civilians Are Turning to a Controversial Tool to Find Justice.
Users Civilians Are Using A.I. to Unmask ICE Agents. Websites like ICEList are attempting to hold federal agents accountable--but it's unclear whether they make the system safer or more dangerous. After federal immigration officers shot Alex Pretti in Minneapolis, social media users called for the unmasking of the agents responsible. On X, users shared photos of the agents involved. It didn't take long before A.I.-generated pictures made their appearance: One user posted a seemingly deepfaked picture of a masked ICE agent, writing, "This is one of the soulless lowlife ghouls who executed Alex Pretti in cold blood!
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.62)
- North America > United States > New York (0.05)
- North America > United States > Missouri > Greene County > Springfield (0.05)
- (2 more...)
Multiscale Grassmann Manifolds for Single-Cell Data Analysis
Wang, Xiang Xiang, Cottrell, Sean, Wei, Guo-Wei
Single-cell data analysis seeks to characterize cellular heterogeneity based on high-dimensional gene expression profiles. Conventional approaches represent each cell as a vector in Euclidean space, which limits their ability to capture intrinsic correlations and multiscale geometric structures. We propose a multiscale framework based on Grassmann manifolds that integrates machine learning with subspace geometry for single-cell data analysis. By generating embeddings under multiple representation scales, the framework combines their features from different geometric views into a unified Grassmann manifold. A power-based scale sampling function is introduced to control the selection of scales and balance in- formation across resolutions. Experiments on nine benchmark single-cell RNA-seq datasets demonstrate that the proposed approach effectively preserves meaningful structures and provides stable clustering performance, particularly for small to medium-sized datasets. These results suggest that Grassmann manifolds offer a coherent and informative foundation for analyzing single cell data.
- North America > United States > Michigan > Ingham County > Lansing (0.04)
- North America > United States > Michigan > Ingham County > East Lansing (0.04)
- North America > United States > New York (0.04)
- North America > United States > Missouri > Greene County > Springfield (0.04)
- Telecommunications (1.00)
- Information Technology > Networks (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Therapeutic Area (0.93)
Variational Geometry-aware Neural Network based Method for Solving High-dimensional Diffeomorphic Mapping Problems
Li, Zhiwen, Ho, Cheuk Hin, Lui, Lok Ming
Traditional methods for high-dimensional diffeomorphic mapping often struggle with the curse of dimensionality. We propose a mesh-free learning framework designed for $n$-dimensional mapping problems, seamlessly combining variational principles with quasi-conformal theory. Our approach ensures accurate, bijective mappings by regulating conformality distortion and volume distortion, enabling robust control over deformation quality. The framework is inherently compatible with gradient-based optimization and neural network architectures, making it highly flexible and scalable to higher-dimensional settings. Numerical experiments on both synthetic and real-world medical image data validate the accuracy, robustness, and effectiveness of the proposed method in complex registration scenarios.
- Asia > China > Hong Kong (0.04)
- North America > United States > Missouri > Greene County > Springfield (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > Canada > British Columbia (0.04)
From Memorization to Generalization: Fine-Tuning Large Language Models for Biomedical Term-to-Identifier Normalization
Pericharla, Suswitha, Hier, Daniel B., Obafemi-Ajayi, Tayo
Effective biomedical data integration depends on automated term normalization, the mapping of natural language biomedical terms to standardized identifiers. This linking of terms to identifiers is essential for semantic interoperability. Large language models (LLMs) show promise for this task but perform unevenly across terminologies. We evaluated both memorization (training-term performance) and generalization (validation-term performance) across multiple biomedical ontologies. Fine-tuning Llama 3.1 8B revealed marked differences by terminology. GO mappings showed strong memorization gains (up to 77% improvement in term-to-identifier accuracy), whereas HPO showed minimal improvement. Generalization occurred only for protein-gene (GENE) mappings (13.9% gain), while fine-tuning for HPO and GO yielded negligible transfer. Baseline accuracy varied by model scale, with GPT-4o outperforming both Llama variants for all terminologies. Embedding analyses showed tight semantic alignment between gene symbols and protein names but weak alignment between terms and identifiers for GO or HPO, consistent with limited lexicalization. Fine-tuning success depended on two interacting factors: identifier popularity and lexicalization. Popular identifiers were more likely encountered during pretraining, enhancing memorization. Lexicalized identifiers, such as gene symbols, enabled semantic generalization. By contrast, arbitrary identifiers in GO and HPO constrained models to rote learning. These findings provide a predictive framework for when fine-tuning enhances factual recall versus when it fails due to sparse or non-lexicalized identifiers.
- North America > United States > Missouri > Greene County > Springfield (0.05)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > France > Île-de-France > Paris > Paris (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)
Mental Multi-class Classification on Social Media: Benchmarking Transformer Architectures against LSTM Models
Hasan, Khalid, Saquer, Jamil, Zhang, Yifan
Millions of people openly share mental health struggles on social media, providing rich data for early detection of conditions such as depression, bipolar disorder, etc. However, most prior Natural Language Processing (NLP) research has focused on single-disorder identification, leaving a gap in understanding the efficacy of advanced NLP techniques for distinguishing among multiple mental health conditions. In this work, we present a large-scale comparative study of state-of-the-art transformer versus Long Short-Term Memory (LSTM)-based models to classify mental health posts into exclusive categories of mental health conditions. We first curate a large dataset of Reddit posts spanning six mental health conditions and a control group, using rigorous filtering and statistical exploratory analysis to ensure annotation quality. We then evaluate five transformer architectures (BERT, RoBERTa, DistilBERT, ALBERT, and ELECTRA) against several LSTM variants (with or without attention, using contextual or static embeddings) under identical conditions. Experimental results show that transformer models consistently outperform the alternatives, with RoBERTa achieving 91-99% F1-scores and accuracies across all classes. Notably, attention-augmented LSTMs with BERT embeddings approach transformer performance (up to 97% F1-score) while training 2-3.5 times faster, whereas LSTMs using static embeddings fail to learn useful signals. These findings represent the first comprehensive benchmark for multi-class mental health detection, offering practical guidance on model selection and highlighting an accuracy-efficiency trade-off for real-world deployment of mental health NLP systems.
- Research Report > New Finding (0.88)
- Research Report > Experimental Study (0.87)
Efficient Hate Speech Detection: Evaluating 38 Models from Traditional Methods to Transformers
Abusaqer, Mahmoud, Saquer, Jamil, Shatnawi, Hazim
The proliferation of hate speech on social media necessitates automated detection systems that balance accuracy with computational efficiency. This study evaluates 38 model configurations in detecting hate speech across datasets ranging from 6.5K to 451K samples. We analyze transformer architectures (e.g., BERT, RoBERTa, Distil-BERT), deep neural networks (e.g., CNN, LSTM, GRU, Hierarchical Attention Networks), and traditional machine learning methods (e.g., SVM, CatBoost, Random Forest). Our results show that transformers, particularly RoBERTa, consistently achieve superior performance with accuracy and F1-scores exceeding 90%. Among deep learning approaches, Hierarchical Attention Networks yield the best results, while traditional methods like CatBoost and SVM remain competitive, achieving F1-scores above 88% with significantly lower computational costs. Additionally, our analysis highlights the importance of dataset characteristics, with balanced, moderately sized unprocessed datasets outperforming larger, preprocessed datasets. These findings offer valuable insights for developing efficient and effective hate speech detection systems.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)
- (23 more...)
Explainability of CNN Based Classification Models for Acoustic Signal
Faruqui, Zubair, McIntire, Mackenzie S., Dubey, Rahul, McEntee, Jay
Explainable Artificial Intelligence (XAI) has emerged as a critical tool for interpreting the predictions of complex deep learning models. While XAI has been increasingly applied in various domains within acoustics, its use in bioacoustics, which involves analyzing audio signals from living organisms, remains relatively underexplored. In this paper, we investigate the vocalizations of a bird species with strong geographic variation throughout its range in North America. Audio recordings were converted into spectrogram images and used to train a deep Convolutional Neural Network (CNN) for classification, achieving an accuracy of 94.8\%. To interpret the model's predictions, we applied both model-agnostic (LIME, SHAP) and model-specific (DeepLIFT, Grad-CAM) XAI techniques. These techniques produced different but complementary explanations, and when their explanations were considered together, they provided more complete and interpretable insights into the model's decision-making. This work highlights the importance of using a combination of XAI techniques to improve trust and interoperability, not only in broader acoustics signal analysis but also argues for broader applicability in different domain specific tasks.
- North America > United States > New Mexico (0.04)
- North America > United States > Missouri > Greene County > Springfield (0.04)
- North America > United States > Arizona (0.04)
- North America > United States > Maine > Cumberland County > Portland (0.04)
- Leisure & Entertainment (0.66)
- Media > Music (0.48)
Advancing Mental Disorder Detection: A Comparative Evaluation of Transformer and LSTM Architectures on Social Media
Hasan, Khalid, Saquer, Jamil, Ghosh, Mukulika
--The rising prevalence of mental health disorders necessitates the development of robust, automated tools for early detection and monitoring. Recent advances in Natural Language Processing (NLP), particularly transformer-based architectures, have demonstrated significant potential in text analysis. This study provides a comprehensive evaluation of state-of-the-art transformer models (BERT, RoBERT a, DistilBERT, ALBERT, and ELECTRA) against Long Short-T erm Memory (LSTM) based approaches using different text embedding techniques for mental health disorder classification on Reddit. We construct a large annotated dataset, validating its reliability through statistical judgmental analysis and topic modeling. Experimental results demonstrate the superior performance of transformer models over traditional deep-learning approaches. RoBERT a achieved the highest classification performance, with a 99.54% F1 score on the hold-out test set and a 96.05% F1 score on the external test set. Notably, LSTM models augmented with BERT embeddings proved highly competitive, achieving F1 scores exceeding 94% on the external dataset while requiring significantly fewer computational resources. We discuss the implications for clinical applications and digital mental health interventions, offering insights into the capabilities and limitations of state-of-the-art NLP methodologies in mental disorder detection. Mental health conditions such as depression, anxiety, and schizophrenia remain prevalent global health challenges. According to the World Health Organization (WHO), one in eight people will experience a mental illness during their lifetime [1], making early detection and intervention critical. Untreated mental health conditions often lead to serious functional impairment, reduced quality of life, and increased mortality risk [2]. These disorders' significant societal and economic impact underscores the need for effective monitoring and treatment tools [3].
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Missouri > Greene County > Springfield (0.04)
- North America > United States > Colorado > Denver County > Denver (0.04)
- (3 more...)
Beyond Architectures: Evaluating the Role of Contextual Embeddings in Detecting Bipolar Disorder on Social Media
Bipolar disorder is a chronic mental illness frequently underdiagnosed due to subtle early symptoms and social stigma. This paper explores the advanced natural language processing (NLP) models for recognizing signs of bipolar disorder based on user-generated social media text. We conduct a comprehensive evaluation of transformer-based models (BERT, RoBERTa, ALBERT, ELECTRA, DistilBERT) and Long Short Term Memory (LSTM) models based on contextualized (BERT) and static (GloVe, Word2Vec) word embeddings. Experiments were performed on a large, annotated dataset of Reddit posts after confirming their validity through sentiment variance and judgmental analysis. Our results demonstrate that RoBERTa achieves the highest performance among transformer models with an F1 score of ~98% while LSTM models using BERT embeddings yield nearly identical results. In contrast, LSTMs trained on static embeddings fail to capture meaningful patterns, scoring near-zero F1. These findings underscore the critical role of contextual language modeling in detecting bipolar disorder. In addition, we report model training times and highlight that DistilBERT offers an optimal balance between efficiency and accuracy. In general, our study offers actionable insights for model selection in mental health NLP applications and validates the potential of contextualized language models to support early bipolar disorder screening.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Missouri > Greene County > Springfield (0.04)
- North America > United States > Colorado > Denver County > Denver (0.04)
- (3 more...)
Preserving Privacy, Increasing Accessibility, and Reducing Cost: An On-Device Artificial Intelligence Model for Medical Transcription and Note Generation
Thomas, Johnson, Mudgal, Ayush, Liu, Wendao, Tahiraj, Nisten, Mohammed, Zeeshaan, Diddi, Dhruv
Background: Clinical documentation represents a significant burden for healthcare providers, with physicians spending up to 2 hours daily on administrative tasks. Recent advances in large language models (LLMs) offer promising solutions, but privacy concerns and computational requirements limit their adoption in healthcare settings. Objective: To develop and evaluate a privacy-preserving, on-device medical transcription system using a fine-tuned Llama 3.2 1B model capable of generating structured medical notes from medical transcriptions while maintaining complete data sovereignty entirely in the browser. Methods: We fine-tuned a Llama 3.2 1B model using Parameter-Efficient Fine-Tuning (PEFT) with LoRA on 1,500 synthetic medical transcription-to-structured note pairs. The model was evaluated against the base Llama 3.2 1B on two datasets: 100 endocrinology transcripts and 140 modified ACI benchmark cases. Evaluation employed both statistical metrics (ROUGE, BERTScore, BLEURT) and LLM-as-judge assessments across multiple clinical quality dimensions. Results: The fine-tuned OnDevice model demonstrated substantial improvements over the base model. On the ACI benchmark, ROUGE-1 scores increased from 0.346 to 0.496, while BERTScore F1 improved from 0.832 to 0.866. Clinical quality assessments showed marked reduction in major hallucinations (from 85 to 35 cases) and enhanced factual correctness (2.81 to 3.54 on 5-point scale). Similar improvements were observed on the internal evaluation dataset, with composite scores increasing from 3.13 to 4.43 (+41.5%). Conclusions: Fine-tuning compact LLMs for medical transcription yields clinically meaningful improvements while enabling complete on-device browser deployment. This approach addresses key barriers to AI adoption in healthcare: privacy preservation, cost reduction, and accessibility for resource-constrained environments.
- Research Report > New Finding (0.46)
- Research Report > Promising Solution (0.34)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Therapeutic Area > Endocrinology (0.38)